Coordinated multi‐agent hierarchical deep reinforcement learning to solve multi‐trip vehicle routing problems with soft time windows
Authors
Abstract
The Vehicle Routing Problem (VRP) is a widespread problem in the transportation field, one that challenges the intelligence of vehicle decision-making. The Multi-Trip VRP with Time Windows (MTVRPTW), a further evolution of the VRP that considers multiple departures from one depot and temporal constraints on visiting nodes, has become a critical issue in the scheduling of logistics, bus transit, railway, and aviation. Traditionally, the MTVRPTW is solved by heuristic algorithms, which are generally time-consuming and yield non-steady results. Reinforcement learning (RL) and multi-agent frameworks have become popular for solving the VRP to get better performance. However, the lack of varied dimensions in the search space and of knowledge exchange between agents inhibits the improvement of these algorithms. Therefore, a Coordinated Multi-agent Hierarchical Deep Reinforcement Learning (CMA-HDRL) method is proposed in this study to enhance overall solution quality and convergence rate by constructing a three-layered structure (time, communication, and global layers), particularly designed to handle state explosion and improve collaboration between agents. The results show that CMA-HDRL can significantly outperform a general genetic algorithm (GA), RL, and hierarchical RL, not only in effectiveness on cost, consisting of travel time and penalty, but also in operational robustness.
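The cost named in the abstract, travel time plus a penalty for violating soft time windows, can be illustrated with a small sketch. This is not the paper's formulation: the linear early/late penalty, the rates `early_rate` and `late_rate`, the no-waiting rule, and all names below are assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    earliest: float            # start of the soft time window
    latest: float              # end of the soft time window
    service_time: float = 0.0  # time spent serving the customer


def soft_window_penalty(arrival, customer, early_rate=1.0, late_rate=2.0):
    """Linear penalty for arriving outside the window (assumed form)."""
    if arrival < customer.earliest:
        return early_rate * (customer.earliest - arrival)
    if arrival > customer.latest:
        return late_rate * (arrival - customer.latest)
    return 0.0


def multi_trip_cost(trips, travel_time, customers, depot=0,
                    early_rate=1.0, late_rate=2.0):
    """Total travel time plus soft-window penalties for one vehicle.

    `trips` is a list of customer sequences; every trip starts and ends
    at the depot, and the clock keeps running between trips.
    Assumption: service starts on arrival (the vehicle does not wait).
    """
    clock = 0.0
    total_travel = 0.0
    total_penalty = 0.0
    for trip in trips:
        position = depot
        for node in trip:
            leg = travel_time[position][node]
            clock += leg
            total_travel += leg
            total_penalty += soft_window_penalty(
                clock, customers[node], early_rate, late_rate)
            clock += customers[node].service_time
            position = node
        # Return to the depot before the next trip can start.
        leg = travel_time[position][depot]
        clock += leg
        total_travel += leg
    return total_travel + total_penalty


# Hypothetical instance: depot 0 and two customers.
travel_time = [[0, 4, 6], [4, 0, 3], [6, 3, 0]]
customers = {1: Customer(0, 5), 2: Customer(10, 12)}
two_trips = multi_trip_cost([[1], [2]], travel_time, customers)
one_trip = multi_trip_cost([[1, 2]], travel_time, customers)
```

In this toy instance, splitting the customers over two trips adds travel time and makes the vehicle late at the second customer, while a single trip arrives early there instead. Which plan is cheaper depends entirely on the assumed penalty rates.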
Similar resources
A goal programming model for vehicle routing problem with backhauls and soft time windows
The vehicle routing problem with backhauls (VRPB), an extension of the classical vehicle routing problem (VRP), attempts to define a set of routes that serves both linehaul customers, to whom products are to be delivered, and backhaul customers, from whom goods need to be collected. A primary objective for the problem is usually minimizing the total distribution cost. Most real-life problems have othe...
A multi-criteria vehicle routing problem with soft time windows by simulated annealing
This paper presents a multi-criteria vehicle routing problem with soft time windows (VRPSTW) to minimize fleet cost, route cost, and the penalty for violating soft time windows. In this case, the fleet is heterogeneous. The VRPSTW consists of a number of constraints in which vehicles are allowed to serve customers outside the desired time window at a penalty. It is assumed that this relaxation a...
An Effective Search Framework Combining Meta-Heuristics to Solve the Vehicle Routing Problems with Time Windows
Many delivery problems in real-world applications, such as newspaper delivery and courier services, can be formulated as capacitated vehicle routing problems (VRPs) [10], in which we want to route a number of vehicles with limited capacity in order to satisfy customer requests at minimal operational cost. This is usually measured by the number of vehicles used multiplied by the total dista...
Competitive Vehicle Routing Problem with Time Windows and Stochastic Demands
The competitive vehicle routing problem is one of the important issues in the transportation area. In this paper, a new method for the competitive VRP with time windows and stochastic demands is introduced. In the presented method, three time bounds are given, and the probability of arrival time between each time bound is assumed to be uniform. The demands of each customer are different in each time wind...
A Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem
Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...
Journal
Journal title: IET Intelligent Transport Systems
Year: 2023
ISSN: 1751-9578, 1751-956X
DOI: https://doi.org/10.1049/itr2.12394